Automated Epilepsy Diagnosis Using Interictal Scalp EEG 
Abstract: More than 50 million people worldwide
suffer from epilepsy. Traditional diagnosis of epilepsy relies on
tedious visual screening, by highly trained clinicians, of lengthy
EEG recordings that contain seizure (ictal)
activity. Many automatic systems can now
recognize seizure-related EEG signals to assist the diagnosis.
However, it is costly and inconvenient to obtain long-term
EEG data with seizure activity, especially in areas short of
medical resources. We demonstrate in this paper that
interictal scalp EEG data, which is much easier to
collect than ictal data, can be used to automatically diagnose whether
a person is epileptic. In our automated EEG recognition system,
we extract three classes of features from the EEG data and build
Probabilistic Neural Networks (PNNs) fed with these features. We
optimize the feature extraction parameters and combine these
PNNs through a voting mechanism. As a result, our system
achieves 94.07% accuracy, which is very close to the
recognition accuracy reported for experienced medical
professionals.



I. Introduction 

EPILEPSY is the second most common neurological disorder,
affecting 1% of the world population [1]. Eighty-five
percent of patients with epilepsy live in developing
countries [2]. The electroencephalogram (EEG) is routinely used
clinically to diagnose epilepsy [3]. Long-term video-EEG
monitoring can provide 90% positive diagnostic information [4]
and has become the gold standard in epilepsy
diagnosis. For the purpose of this research, we define
"the diagnosis of epilepsy" as the determination of whether a
person is epileptic or non-epileptic [5].

Traditional diagnostic methods rely on experts to visually
inspect lengthy EEG recordings, which is time-consuming and
problematic due to the lack of clear differences in EEG activity
between epileptic and non-epileptic seizures [6], particularly in
seizures of frontal origin. Many automated seizure recognition
techniques have therefore emerged [6]-[17]. However, automatic
seizure recognition/detection algorithms still require the
recording of clinical seizures. Very long continuous EEG
recordings, preferably with synchronized video for several days
or weeks, are therefore needed to capture the seizures. Such
long-term EEG recording can greatly disturb patients' daily
lives. Another clinical concern is that 50-75% of epilepsy
patients in the world reside in areas that lack the medical
resources and trained clinicians needed to make such a process
feasible [2]. Consequently, an automated EEG epilepsy diagnostic
system would be very valuable if it did not require data
containing seizure (i.e., ictal) activity to arrive at the
diagnosis. To the best of our knowledge, however, there is no
report of an automated epilepsy diagnostic system that uses only
interictal scalp EEG data.

Previous research has attempted to create automated
epilepsy diagnostic systems using interictal EEG data [13],
[18]. However, in those trials, only intracranial EEG data
from patients were used, and the EEG artifacts were
carefully removed manually. It is very expensive to obtain
relatively artifact-free intracranial EEG recordings for
every epilepsy patient, which is especially impractical in poor
and rural areas. Therefore, we have built an automated epilepsy
diagnostic system with very good accuracy that can work with
scalp EEG data that contain noise and artifacts.

Artificial Neural Networks (ANNs) have been used for seizure-related
EEG recognition [10]-[15]. In this work we use one
kind of ANN as the classifier, namely the Probabilistic Neural
Network (PNN), for its high speed, high accuracy and ability to
update the network structure in real time [19]. It is very
difficult to use raw EEG data directly as the input of an
ANN [20]. Therefore, the key is to parameterize the EEG
data into features before feeding them into the ANN. We use
features that have been used in previous studies on seizure-related
EEG, namely power spectral features, fractal dimensions
and Hjorth parameters. A simple classifier voting scheme [21]
and parameter optimization are used to improve the accuracy.
Fig. 1. Flow diagram of our EEG classification scheme



The final accuracy of our system on distinguishing interictal 
scalp EEG of epileptic patients vs. the scalp EEG of healthy 
people is 94.07%, which is very close to currently reported 
human diagnosis accuracy [22]. 



II. Data Acquisition 

We compose a data set based on 22-channel routine scalp
EEG recordings from the Dept. of Neurology, Jiangsu Provincial
Hospital of Chinese Medicine, China. The data are from 6
normal people and 6 epileptic patients (in the interictal period
only). They are recorded at a 200 Hz sampling rate using the standard
international 10-20 system with a referential montage. As in
other research [13], the EEG recordings are cut into segments of
4096 samples (i.e., 2^12). Our complete data set has 22,353 segments
per channel, and 491,766 segments in total.
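The segmentation and the segment bookkeeping above can be sketched as follows. This is a minimal illustration, assuming non-overlapping segments and dropping any incomplete tail; the text does not state either choice, and the function name `segment` is hypothetical.

```python
def segment(samples, length=4096):
    """Split one channel into non-overlapping segments (assumption),
    dropping any incomplete tail."""
    count = len(samples) // length
    return [samples[i * length:(i + 1) * length] for i in range(count)]

# Toy example: 10 samples cut into segments of 4 leave 2 full segments.
toy = segment(list(range(10)), length=4)

# Figures quoted in the text: 22,353 segments per channel, 22 channels.
total_segments = 22353 * 22

# At 200 Hz, one 4096-sample segment spans 20.48 seconds.
segment_seconds = 4096 / 200
```

The total of 491,766 segments is simply the per-channel count multiplied by the 22 channels.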



III. Feature Extraction 

Three classes of features are extracted to characterize the EEG
signal: Power Spectral Features, describing its energy distribution
in the frequency domain; Fractal Dimensions, outlining its
fractal property; and Hjorth Parameters, modeling its chaotic
behavior.



A. Power Spectral Features 

As one can see from Fig. 2, the power spectrum is a good way
to distinguish different kinds of EEG signals.

For a time series $x_1, x_2, \ldots, x_N$, its Fast Fourier Transform
(FFT) $X_1, X_2, \ldots, X_N$ is estimated as

$$X_k = \sum_{n=1}^{N} x_n W_N^{kn}, \quad k = 1, 2, \ldots, N$$

where $W_N^{kn} = e^{-j 2 \pi k n / N}$ and $N$ is the series length.
Fig. 2. Typical FFT results of 3 EEG segments (raw data in μV)

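Based on the FFT result, the Power Spectral Intensity (PSI) of
each $f_{step}$ Hz bin in a given band $f_{low}$-$f_{up}$ Hz is evaluated as

$$PSI_k = \sum_{i=\lfloor N f_{min} / f_s \rfloor}^{\lfloor N f_{max} / f_s \rfloor} |X_i|, \quad k = 1, 2, \ldots, K \qquad (1)$$

where $f_{min} = 2k$, $f_{max} = 2k + 2$, $K = (f_{up} - f_{low}) / f_{step}$, $f_s$ is
the sampling rate and $N$ is the series length. $f_{min}$ and $f_{max}$ are
the lower and upper boundaries of each bin, respectively.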

We use the Relative Intensity Ratio (RIR) as the Power Spectral
Features. It is defined as

$$RIR_j = \frac{PSI_j}{\sum_{k=1}^{K} PSI_k}, \quad j = 1, 2, \ldots, K$$

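The PSI of Eq. (1) and a relative ratio built from it can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: a naive DFT stands in for the FFT, array indices are 0-based, each bin is taken as a half-open index range so that neighbouring bins do not share a coefficient, and RIR is assumed to be each bin's PSI divided by the sum over all bins. The names `dft_magnitudes`, `psi` and `rir` are hypothetical.

```python
import cmath
import math

def dft_magnitudes(x):
    """Naive O(N^2) DFT magnitudes |X_k|, k = 0..N-1 (fine for short demos only)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n)]

def psi(x, fs, f_low, f_up, f_step):
    """Power Spectral Intensity of each f_step-wide bin in the band [f_low, f_up]."""
    n = len(x)
    mags = dft_magnitudes(x)
    n_bins = round((f_up - f_low) / f_step)          # K in the text
    out = []
    for k in range(n_bins):
        f_min = f_low + k * f_step                   # lower bin boundary in Hz
        f_max = f_min + f_step                       # upper bin boundary in Hz
        lo = math.floor(n * f_min / fs)
        hi = math.floor(n * f_max / fs)
        out.append(sum(mags[lo:hi]))                 # half-open range: an assumption
    return out

def rir(psi_values):
    """Relative ratio: each bin divided by the total over all bins (assumed form)."""
    total = sum(psi_values)
    return [p / total for p in psi_values]

# One second of a 10 Hz sine sampled at 200 Hz, analysed in 2 Hz bins over 2-34 Hz.
fs = 200
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
ratios = rir(psi(signal, fs, f_low=2, f_up=34, f_step=2))
strongest_bin = max(range(len(ratios)), key=lambda i: ratios[i])
```

With these settings there are 16 bins, and the 10 Hz tone lands in the bin starting at 10 Hz (index 4 counting from zero), which carries almost all of the relative intensity. A signal with no energy in the band would make the total zero, which this sketch does not guard against.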

B. Petrosian Fractal Dimension (PFD)

PFD is defined as

$$PFD = \frac{\log_{10} N}{\log_{10} N + \log_{10}\left(\frac{N}{N + 0.4 N_\delta}\right)}$$

where $N$ is the series length and $N_\delta$ is the number of sign
changes in the signal derivative [23].
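A minimal sketch of the Petrosian computation follows, assuming the standard form PFD = log10(N) / (log10(N) + log10(N / (N + 0.4 N_delta))) and taking the signal derivative as the first difference. The function name `pfd` is hypothetical.

```python
import math

def pfd(x):
    """Petrosian fractal dimension using first differences as the derivative."""
    n = len(x)
    diff = [b - a for a, b in zip(x, x[1:])]
    # N_delta: number of sign changes between consecutive first differences.
    n_delta = sum(1 for a, b in zip(diff, diff[1:]) if a * b < 0)
    return math.log10(n) / (math.log10(n) + math.log10(n / (n + 0.4 * n_delta)))

ramp_pfd = pfd([0, 1, 2, 3, 4, 5, 6, 7])        # monotonic: no sign changes
zigzag_pfd = pfd([0, 1, 0, 1, 0, 1, 0, 1])      # derivative flips at every step
```

A monotonic ramp has no sign changes and gives exactly 1, while the zigzag gives a value above 1, which matches the intuition that more irregular signals have a higher fractal dimension.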

C. Higuchi Fractal Dimension (HFD) 

Higuchi's algorithm [24] constructs $k$ new series from the
original series $x_1, x_2, \ldots, x_N$ by

$$x_m, x_{m+k}, x_{m+2k}, \ldots, x_{m + \lfloor (N-m)/k \rfloor k} \qquad (2)$$

where $m = 1, 2, \ldots, k$.

For each time series constructed from (2), the length $L(m,k)$
is computed by

$$L(m,k) = \frac{1}{k} \cdot \frac{\left( \sum_{i=1}^{\lfloor (N-m)/k \rfloor} |x_{m+ik} - x_{m+(i-1)k}| \right) (N-1)}{\lfloor (N-m)/k \rfloor \, k}$$

The average length $L(k)$ is computed as

$$L(k) = \frac{1}{k} \sum_{m=1}^{k} L(m,k)$$

This procedure is repeated for each $k$ from 1 to $k_{max}$,
and a least-squares fit then determines the slope of
the line that best fits the curve of $\ln(L(k))$ versus $\ln(1/k)$. The
slope is the Higuchi Fractal Dimension. In this paper, $k_{max} = 5$.
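The procedure can be sketched as below. This is a hedged illustration rather than the authors' code: the curve-length normalization follows Higuchi's original definition (including the final division by k), the least-squares slope is computed by hand, and the function name `hfd` is hypothetical.

```python
import math
import random

def hfd(x, k_max=5):
    """Higuchi fractal dimension; needs len(x) >= 2 * k_max and a non-constant signal."""
    n = len(x)
    xs, ys = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):                 # 1-based start offset, as in Eq. (2)
            steps = (n - m) // k                  # floor((N - m) / k)
            dist = sum(abs(x[m + i * k - 1] - x[m + (i - 1) * k - 1])
                       for i in range(1, steps + 1))
            lengths.append(dist * (n - 1) / (steps * k) / k)
        xs.append(math.log(1.0 / k))
        ys.append(math.log(sum(lengths) / k))     # ln of the average length L(k)
    # Least-squares slope of ln L(k) against ln(1/k).
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    den = sum((a - mean_x) ** 2 for a in xs)
    return num / den

ramp = [float(i) for i in range(100)]             # a straight line
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # white noise
ramp_hfd = hfd(ramp)
noise_hfd = hfd(noise)
```

A straight line yields a dimension of 1, and white noise yields a value near 2, the two ends of the range expected for a time series.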

D. Hjorth Parameters 

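For a time series $x_1, x_2, \ldots, x_N$, the Hjorth mobility and
complexity [25] are respectively defined as

$$\sqrt{\frac{M2}{TP}} \quad \text{and} \quad \sqrt{\frac{M4 \cdot TP}{M2 \cdot M2}}$$

where $TP = \sum x_i^2 / N$, $M2 = \sum d_i^2 / N$, $M4 = \sum (d_i - d_{i-1})^2 / N$
and $d_i = x_i - x_{i-1}$.
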
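A short sketch of the two Hjorth parameters follows, assuming the usual power-based form: TP is the mean of the squared samples, M2 the mean of the squared first differences, and M4 the mean of the squared second differences. The signal is not mean-centred here, which is a simplification, and the function name `hjorth` is hypothetical.

```python
import math

def hjorth(x):
    """Return (mobility, complexity) of a non-constant series x."""
    n = len(x)
    d = [b - a for a, b in zip(x, x[1:])]      # first difference d_i
    dd = [b - a for a, b in zip(d, d[1:])]     # second difference d_i - d_{i-1}
    tp = sum(v * v for v in x) / n
    m2 = sum(v * v for v in d) / n
    m4 = sum(v * v for v in dd) / n
    mobility = math.sqrt(m2 / tp)
    complexity = math.sqrt(m4 * tp / (m2 * m2))
    return mobility, complexity

# Ten seconds of a 5 Hz sine sampled at 200 Hz.
sine = [math.sin(2 * math.pi * 5 * t / 200) for t in range(2000)]
mobility, complexity = hjorth(sine)
```

For a pure sine the complexity is close to 1 and the mobility is close to 2 sin(pi f / f_s), so both values act as a sanity check on the implementation.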
IV. Probabilistic Neural Network

In machine learning, a classifier is essentially a mapping
from the feature space to the class space. An Artificial Neural
Network (ANN) implements such a mapping using a group
of interconnected artificial neurons that simulate the human brain.
An ANN can be trained on input and output examples to produce
the expected classification results, so there is no need to
specify a classification algorithm explicitly.

The PNN is a distance-based ANN that uses a bell-shaped
activation function. Compared with the traditional back-propagation
(BP) neural network, the PNN is considered more
suitable for medical applications since it uses a Bayesian strategy,
a process familiar to medical decision makers [26]. The decision
boundaries of a PNN can be modified in real time as new data
become available [19], without retraining the network
over the entire data set. We can therefore quickly
update our network as more patients' data become
available.
Fig. 3. PNN structure. R: number of features, Q: number of training samples,
K: number of classes. The input vector p is presented as a black vertical bar.



Our PNN has three layers: the Input Layer, the Radial Basis
Layer, which evaluates distances between the input vector and the rows
of the weight matrix, and the Competitive Layer, which determines
the class with the maximum probability of being correct. The network
structure is illustrated in Fig. 3. Dimensions of matrices are
marked under their names.

A. Radial Basis Layer 

In the Radial Basis Layer, the vector distances between the input
vector p and the weight vectors formed by the rows of the weight
matrix W are calculated. Here, the vector distance is defined
as the dot product between two vectors [19]. The dot product
between p and the i-th row of W produces the i-th element
of the distance vector, denoted as ||W - p||. The bias
vector b is then combined with ||W - p|| by an element-by-element
multiplication, represented as ".×" in Fig. 3. The
result is denoted as n = ||W - p|| .× b.

The transfer function in the PNN has a built-in distance
criterion with respect to a center. In this paper, we define it as

$$\mathrm{radbas}(n) = e^{-n^2} \qquad (3)$$

Each element of n is substituted into (3) and produces the corresponding
element of a, the output vector of the Radial Basis
Layer. We can represent the i-th element of a as

$$a_i = \mathrm{radbas}(\|W_i - p\| \, .\!\times \, b_i) \qquad (4)$$

where $W_i$ is the i-th row of W and $b_i$ is the i-th element of the
bias vector b.

1) Radial Basis Layer Weights: Each row of W is the
feature vector of one training sample. The number of rows
equals the number of training samples.

2) Radial Basis Layer Biases: All biases in the radial basis
layer are set to $\sqrt{-\ln 0.5}/s$, resulting in radial basis functions
that cross 0.5 at weighted inputs of $\pm s$, where s is the spread
constant of the PNN. In our experience, s = 0.1
results in the highest accuracy.
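The choice of bias can be checked with a one-line calculation. The derivation below assumes the Gaussian transfer function radbas(n) = e^{-n^2} used by the MATLAB toolbox, a bias of sqrt(-ln 0.5)/s, and an input whose distance from the stored sample is exactly s:

```latex
\mathrm{radbas}\!\left( s \cdot \frac{\sqrt{-\ln 0.5}}{s} \right)
  = e^{-\left( \sqrt{-\ln 0.5} \right)^{2}}
  = e^{\ln 0.5}
  = 0.5
```

A smaller spread s therefore narrows each bell curve, so that only training samples very close to the input contribute noticeably to the layer output.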

B. Competitive Layer 

There is no bias in the Competitive Layer. In this layer, the
vector a is first multiplied by the layer weight matrix M, producing
an output vector d. The competitive function C produces a 1
corresponding to the largest element of d, and 0's elsewhere.
The index of the 1 is the class of the EEG segment. M is set
to a K × Q matrix of Q target class vectors. If the i-th sample
in the training set is of class j, then we have a 1 on the j-th row
of the i-th column of M.
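The two layers can be put together in a compact sketch. This is an illustration under stated assumptions, not the toolbox implementation: the distance is taken to be Euclidean, the transfer function is taken to be e^{-n^2}, classes are numbered from 0, and the multiplication by M is done by summing activations per class. The name `pnn_classify` and its parameters are hypothetical.

```python
import math

def pnn_classify(train_x, train_y, p, spread=0.1):
    """Classify feature vector p; train_y holds class indices 0..K-1."""
    bias = math.sqrt(-math.log(0.5)) / spread
    # Radial Basis Layer: one neuron per training sample (one row of W).
    a = []
    for w in train_x:
        dist = math.sqrt(sum((wi - pi) ** 2 for wi, pi in zip(w, p)))
        a.append(math.exp(-(dist * bias) ** 2))
    # Competitive Layer: d = M a, where M has a 1 in row j, column i
    # when training sample i belongs to class j.
    n_classes = max(train_y) + 1
    d = [0.0] * n_classes
    for activation, label in zip(a, train_y):
        d[label] += activation
    # The competitive function picks the index of the largest element of d.
    return max(range(n_classes), key=lambda j: d[j])

train_x = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
train_y = [0, 0, 1, 1]
near_first = pnn_classify(train_x, train_y, [0.05, 0.02])
near_second = pnn_classify(train_x, train_y, [0.95, 0.9])
```

Adding a new patient's data only appends rows to `train_x` and `train_y`, which is why the network can be updated without retraining. If every activation underflows to zero, this sketch falls back to class 0.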



V. Combining Classifiers Using Voting 

A simple voting scheme [21] is used to improve the classification
accuracy in this paper. We first build one component
classifier for each channel and then combine them as follows.
Given 22 segments collected at the same time (from different
channels), each of them is classified by the component
classifier for its channel. The component classifier of
each channel judges whether the given EEG segment is
epileptic. The final classification decision is based on the votes
of the component classifiers. The voting rule we use here is the
majority rule. Fig. 4 shows how the combined
classifiers work.
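The majority rule can be sketched as follows. With 22 channels an 11-to-11 tie is possible; the text does not say how ties are handled, so this sketch assumes a tie is decided as non-epileptic. The function name `majority_vote` is hypothetical.

```python
def majority_vote(channel_votes):
    """Combine per-channel decisions (True = epileptic) by simple majority."""
    positives = sum(1 for vote in channel_votes if vote)
    # Assumption: a tie (e.g. 11 vs 11 over 22 channels) counts as non-epileptic.
    return positives > len(channel_votes) / 2

clear_majority = majority_vote([True] * 12 + [False] * 10)
tie = majority_vote([True] * 11 + [False] * 11)
```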
Fig. 4. Classification Voting Scheme



VI. Experimental Results 

In the experiments, we use the MATLAB Neural Network
Toolbox to implement the PNN. The data used in the experiments
are labeled as interictal (positive) or healthy (negative). The
interictal data set has the same size as the healthy one. The
testing method for the PNN is Leave-One-Out Cross-Validation
(LOOCV) [21], in which exactly one sample is used as the test
sample while all the rest serve as training samples, and the process
repeats until every sample has been used as a test sample
exactly once.
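The LOOCV loop itself is short. In the sketch below a 1-nearest-neighbour rule stands in for the PNN purely to keep the example self-contained; the names `loocv_accuracy` and `nearest_neighbour` are hypothetical, and the data are toy values.

```python
def loocv_accuracy(samples, labels, classify):
    """Hold out each sample once, train on the rest, and report the hit rate."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

def nearest_neighbour(train_x, train_y, p):
    """Stand-in classifier (an assumption); the paper uses a PNN here."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], p)))
    return train_y[best]

samples = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
labels = [0, 0, 0, 1, 1, 1]
accuracy = loocv_accuracy(samples, labels, nearest_neighbour)
```

Because a PNN stores training samples directly as weights, leaving one sample out amounts to dropping one row, which keeps LOOCV affordable even for large data sets.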

We notice that different parameters used in feature extrac- 
tion can lead to different classifier performance. We will show 
the experimental results using default feature extraction param- 
eters in the first section while using optimized parameters in 
the second section. 
TABLE I

Single Channel Classification Accuracy Using PNN

A. Classification using default feature extraction parameters

The features are extracted using the default parameters
described in Sec. III. We have carried out experiments to
find the best features to use for classification. We use
all possible combinations of these features to build the PNN
classifier: RIRs, Fractal Dimensions (FDs) and Hjorth parameters
(Hjorth's). The performance of each PNN with a specific
combination of features is tested using LOOCV on each
channel. The results are listed in Table I, where each entry is
the LOOCV accuracy of the PNN with the features for that
column on the data set of the channel corresponding to
that row.

From Table I, it is clear that the first feature combination
(using all features) yields the highest accuracy, and thus we
decide to use all extracted features in later experiments to build
the classifiers.

The accuracy of the combined classifiers increases to 
84.27% while the true and false positive rates increase to 
85.36% and 83.18% respectively. Thus, the sensitivity and 
specificity are 83.33% and 84.69%, respectively. 

B. Optimizing feature extraction parameters 

In Sec. II and Sec. III, there are some parameters that
can be changed: the segment length of the EEG signal, the cut-off
frequency of the filters, and the bin ($f_{step}$) and band ($f_{low}$ and
$f_{up}$) in Eq. (1). A combination of those parameters is called
a configuration. In this subsection, we show that the
configuration is important to the classification: an optimized
configuration can lead to better accuracy. The different feature
extraction parameters used in this paper are listed in Table II.
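The set of configurations can be enumerated as a sketch, using the parameter values from the feature extraction parameter table and skipping the two cut-off frequencies that the text reports as not tested for 4096-sample segments. The constant and function names are hypothetical.

```python
from itertools import product

SEGMENT_LENGTHS = [4096, 8192]                       # samples
CUTOFFS_HZ = [40, 46, 56, 66]
BANDS = [(2, 32, 1), (2, 34, 2), (2, 34.5, 2.5)]     # (f_low, f_up, f_step) in Hz

def configurations():
    """Yield every tested (segment length, cut-off, band) combination."""
    for length, cutoff, band in product(SEGMENT_LENGTHS, CUTOFFS_HZ, BANDS):
        if length == 4096 and cutoff in (56, 66):
            continue                                 # reported as not tested
        yield length, cutoff, band

configs = list(configurations())
# Number of spectral bins K = (f_up - f_low) / f_step for each band setting.
bins_per_band = [round((up - low) / step) for low, up, step in BANDS]
```

This gives 18 configurations in total, and the three band settings produce 30, 16 and 13 spectral bins respectively.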

Table III shows the accuracies of the combined PNN-based classifiers
in different configurations. The cut-off frequencies of 56
and 66 Hz are not tested for segment length 4096, because
we find that longer segments give higher accuracy. An
interesting finding is that once the filter cut-off frequency
reaches 46 Hz, the accuracy does not increase significantly.
One possible explanation is that many spikes may exist in
interictal EEG and most reside in a frequency range of 15
to 50 Hz. Increasing the filter cut-off frequency may also
introduce line noise from the power supply or other sources, which
does not benefit EEG signal quality [27]. Table III shows that the
highest accuracy is 94.07%, which is almost the same as the
epilepsy diagnosis accuracy of human experts reported in a medical
journal [22].



TABLE II

Feature Extraction Parameters Used In This Paper

Parameters                     Values
segment length                 4096 or 8192 samples
cut-off frequency of filters   40, 46, 56 or 66 Hz
spectral band and bin          band: 2-32 Hz, bin: 1 Hz
                               band: 2-34 Hz, bin: 2 Hz
                               band: 2-34.5 Hz, bin: 2.5 Hz



TABLE III

Accuracy (%) of Voted Classifier (PNN) in Different
Configurations

                          band and bin (f_low-f_up, f_step) in Hz
Length   cut-off freq.    2-32, 1    2-34, 2    2-34.5, 2.5
4096     40 Hz            86.41      84.27      83.41
4096     46 Hz            91.77      89.81      89.23
8192     40 Hz            90.19      87.80      86.86
8192     46 Hz            93.73      91.93      91.92
8192     56 Hz            94.07      92.14      91.37
8192     66 Hz            93.78      91.96      91.13


VII. Conclusions 

In this paper, an automated interictal scalp EEG recognition
system for epilepsy diagnosis is developed and validated.
Three classes of features are extracted and PNNs are employed
to classify EEG segments using those features. To improve the
accuracy, we optimize the feature extraction parameters and
design a final classifier that combines several PNN-based
classifiers. Our system reaches an accuracy of 94.07%, which is
very close to the accuracy achieved by human experts. Compared with
existing approaches to epilepsy diagnosis, our approach
does not require seizure activity to occur during EEG
recording. This reduces the difficulty of EEG collection,
since interictal data are much easier to collect than ictal
data. Our system is therefore particularly helpful for areas short of
medical resources.